Measuring Interventional Robustness in Reinforcement Learning
Recent work in reinforcement learning has focused on several characteristics
of learned policies that go beyond maximizing reward. These properties include
fairness, explainability, generalization, and robustness. In this paper, we
define interventional robustness (IR), a measure of how much variability is
introduced into learned policies by incidental aspects of the training
procedure, such as the order of training data or the particular exploratory
actions taken by agents. A training procedure has high IR when the agents it
produces take very similar actions under intervention, despite variation in
these incidental aspects of the training procedure. We develop an intuitive,
quantitative measure of IR and calculate it for eight algorithms in three Atari
environments across dozens of interventions and states. From these experiments,
we find that IR varies with the amount of training and type of algorithm and
that, contrary to what one might expect, high performance does not imply high IR.
Comment: 17 pages, 13 figures
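The abstract does not spell out the IR formula, so the following is only a minimal sketch of one plausible instantiation, assuming (hypothetically) that IR is scored as the average pairwise agreement of greedy actions among agents that differ only in incidental training factors, evaluated at a set of intervened states; the function name and interfaces are invented for illustration.

```python
import itertools
import numpy as np

def interventional_robustness(policies, states):
    """Hypothetical IR score (not the paper's exact definition):
    mean pairwise action agreement across independently trained agents,
    evaluated on states produced by interventions.

    policies: list of callables, each mapping a state to a greedy action;
              the agents differ only in incidental training factors
              (data order, exploratory actions, random seed).
    states:   iterable of intervened states at which agents are compared.
    """
    per_state_agreement = []
    for s in states:
        actions = [pi(s) for pi in policies]
        # Fraction of agent pairs that choose the same action in state s.
        pairs = itertools.combinations(actions, 2)
        per_state_agreement.append(np.mean([a == b for a, b in pairs]))
    # High IR: agents act nearly identically despite incidental variation.
    return float(np.mean(per_state_agreement))
```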
Adaptive Selection of the Optimal Strategy to Improve Precision and Power in Randomized Trials
Benkeser et al. demonstrate how adjustment for baseline covariates in
randomized trials can meaningfully improve precision for a variety of outcome
types. Their findings build on a long history, starting in 1932 with R.A.
Fisher and including more recent endorsements by the U.S. Food and Drug
Administration and the European Medicines Agency. Here, we address an important
practical consideration: *how* to select the adjustment approach -- which
variables and in which form -- to maximize precision, while maintaining Type-I
error control. Balzer et al. previously proposed *Adaptive Prespecification*
within TMLE to flexibly and automatically select, from a prespecified set, the
approach that maximizes empirical efficiency in small trials (N<40). To avoid
overfitting with few randomized units, selection was previously limited to
working generalized linear models, adjusting for a single covariate. Now, we
tailor Adaptive Prespecification to trials with many randomized units. Using
V-fold cross-validation and the estimated influence curve squared as the loss
function, we select from an expanded set of candidates, including modern
machine learning methods adjusting for multiple covariates. As assessed in
simulations exploring a variety of data generating processes, our approach
maintains Type-I error control (under the null) and offers substantial gains in
precision -- equivalent to 20-43% reductions in sample size for the same
statistical power. When applied to real data from ACTG Study 175, we also see
meaningful efficiency improvements overall and within subgroups.Comment: 10.5 pages of main text (including 2 tables, 2 figures) + 14.5 pages
of Supporting Inf
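The abstract specifies the selection criterion (V-fold cross-validation with the estimated influence curve squared as the loss) but no implementation; below is a minimal sketch under stated assumptions, not the authors' code. The function select_adjustment and the candidates interface (each candidate, once fit on training folds, returns per-unit estimated influence curve values) are hypothetical.

```python
import numpy as np
from sklearn.model_selection import KFold

def select_adjustment(candidates, W, A, Y, n_splits=10, seed=1):
    """Hypothetical sketch of Adaptive Prespecification for larger trials:
    pick the candidate adjustment approach with the smallest V-fold
    cross-validated mean of the estimated influence curve squared,
    a proxy for the variance of the resulting effect estimator.

    candidates: dict mapping a name to a fit function;
                fit(W_tr, A_tr, Y_tr) returns ic(W, A, Y), which gives
                per-unit estimated influence curve values.
    W, A, Y:    baseline covariates, randomized arm, outcome (numpy arrays).
    """
    kf = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    cv_risk = {}
    for name, fit in candidates.items():
        fold_risks = []
        for train, test in kf.split(W):
            ic = fit(W[train], A[train], Y[train])   # fit on training folds
            fold_risks.append(np.mean(ic(W[test], A[test], Y[test]) ** 2))
        cv_risk[name] = float(np.mean(fold_risks))   # CV variance proxy
    # Most empirically efficient candidate: smallest CV risk.
    return min(cv_risk, key=cv_risk.get)
```

Candidates could range from the unadjusted estimator through working GLMs up to machine learning fits adjusting for multiple covariates; the selection criterion itself is agnostic to how each candidate adjusts.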